A popular approach for describing the structure of many complex networks focuses on graph theoretic properties that characterize their large-scale connectivity. While it is generally recognized that such descriptions based on aggregate statistics do not uniquely characterize a particular graph and also that many such statistical features are interdependent, the relationship between competing descriptions is not entirely understood. This paper lends perspective on this problem by showing how the degree sequence and other constraints (e.g., connectedness, no self-loops or parallel edges) on a particular graph play a primary role in dictating many features, including its correlation structure. Building on recent work, we show how a simple structural metric characterizes key differences between graphs having the same degree sequence. More broadly, we show how the (often implicit) choice of a background set against which to measure graph features has serious implications for the interpretation and comparability of graph theoretic descriptions.
INTRODUCTION

The recent use of network models to describe complex systems has emphasized the study of graph theoretic properties as a means to characterize the similarities and differences in structure and function of systems across a variety of domains [1-7]. Considerable effort has been directed both at the empirical analysis of graph theoretic properties of real systems and at the development of generative models that attempt to explain such properties. An implicit assumption in much of this work is that graph theoretic properties adequately capture key system features in order to serve as a basis for comparison and contrast.

Notwithstanding the potential pitfalls of reducing a complex system (e.g., one that may involve heterogeneous components, layered architectures, and feedback dynamics) to a simple graph [8,9], there exists the practical problem that many descriptions based on aggregate statistics do not uniquely characterize the system of interest. In fact, there often exists considerable diversity among graphs that share any single statistical feature, particularly when viewed through the lens of a specific application domain. For example, recent work on the router-level Internet has shown that there is enough diversity among graphs having the same power-law node degree distribution that, although indistinguishable when viewed by this aggregate statistic, these graphs can actually be interpreted as "opposites" when viewed from an engineering perspective that incorporates technology constraints and is motivated by throughput performance [8,10,11].

The purpose of this paper is to explore this notion of graph diversity and characterize more completely the way in which the degree sequence of a particular graph dictates many popular graph features, including its correlation structure. Furthermore, this paper emphasizes the importance of choosing an appropriate "background set" when evaluating a graph, as well as the importance of making sure that the comparative analysis of two graphs is conducted with respect to an appropriate reference. In this regard, we show that not all graph theoretic measures have an obvious interpretation or are directly comparable.

DEGREE SEQUENCE AND GRAPH DIVERSITY

For a graph with $n$ vertices, let $d_i$ denote the degree (i.e., number of connections) of vertex $i$, $1 \le i \le n$, and call $D = \{d_1, d_2, \ldots, d_n\}$ the degree sequence of the graph, assumed without loss of generality always to be ordered $d_1 \ge d_2 \ge \cdots \ge d_n$. Within the space of all graphs having $n$ vertices, let $\mathcal{G}(D)$ denote the considerably smaller subset of graphs having particular degree sequence $D$.

Not all sequences of integers $D$ correspond to realizable graphs. One well-known characterization of whether or not a sequence $D$ corresponds to a simple graph is due to Erdős and Gallai [12], who observed that a sequence of positive integers $d_1, d_2, \ldots, d_n$ with $d_1 \ge d_2 \ge \cdots \ge d_n$ is graphical if and only if $\sum_{i=1}^{n} d_i$ is even and for each integer $k$, $1 \le k \le n-1$,

$$\sum_{j=1}^{k} d_j \le k(k-1) + \sum_{j=k+1}^{n} \min(k, d_j).$$

Recent work has further reduced the number of sufficient conditions to be checked [13], and several algorithms have been developed to test for the existence of a graph satisfying a particular degree sequence $D$ [14].
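As a rough illustration of the Erdős-Gallai condition stated above, the following minimal sketch tests whether a sequence is graphical. The function name `is_graphical` is hypothetical, and the loop also checks $k = n$, which is redundant for $n \ge 2$ but guards the single-vertex case. This is the textbook test only, not the reduced condition set of [13] or the algorithms of [14].

```python
def is_graphical(degrees):
    """Erdos-Gallai test: can `degrees` be realized by a simple graph?"""
    d = sorted(degrees, reverse=True)  # the test assumes d_1 >= d_2 >= ... >= d_n
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    for k in range(1, n + 1):
        lhs = sum(d[:k])                                   # sum of the k largest degrees
        rhs = k * (k - 1) + sum(min(k, x) for x in d[k:])  # most links they can absorb
        if lhs > rhs:
            return False
    return True


print(is_graphical([2, 2, 2, 1, 1]))  # a 5-node chain
print(is_graphical([3, 3, 1, 1]))     # fails the inequality at k = 2
```

Note that the test says nothing about connectedness; a graphical sequence may still admit only disconnected realizations.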

The restriction to graphs having a particular degree sequence has been considered previously in the context of graph generation mechanisms [2,15]. In particular, the configuration model (CM) [2,16,17] often serves as the null hypothesis of networks having a particular degree sequence, since it yields graphs that are maximally random (in the sense of maximum entropy) while conforming to a specified degree sequence $D$. In what follows, we will always restrict attention to graphs with a specified $D$.

DAVID L. ALDERSON AND LUN LI
PHYSICAL REVIEW E 75, 046102 (2007)
1539-3755/2007/75(4)/046102(11) ©2007 The American Physical Society

Report Documentation Page (Standard Form 298, Rev. 8-98; OMB No. 0704-0188)
Report date: AUG 2006
Title: Diversity of graphs with highly variable connectivity
Performing organization: Naval Postgraduate School, Operations Research Department, Monterey, CA 93943
Distribution/availability: Approved for public release; distribution unlimited
Security classification (report, abstract, this page): unclassified
Limitation of abstract: Same as Report (SAR)
Number of pages: 11

In considering the structural features of a particular graph, we leverage previous work [18] and define, for any graph $g$ having fixed degree sequence $D$, the $s$ metric

$$s(g) = \sum_{(i,j)\in\mathcal{E}} d_i d_j = \frac{1}{2}\sum_{i\in\mathcal{V}}\sum_{j\in\mathcal{V}} d_i a_{ij} d_j, \qquad (1)$$

where $A = [a_{ij}]$ is the vertex adjacency matrix for the graph, and $\mathcal{V}$ and $\mathcal{E}$ denote the sets of all vertices and edges in the graph, respectively. Accordingly, the numbers of vertices and edges in the graph are $n = |\mathcal{V}|$ and $l = |\mathcal{E}|$, respectively. Note that the summation in (1) is easily computed for any graph and does not depend on the process by which it was constructed. Implicitly, the metric $s(g)$ measures the extent to which the graph $g$ has a hublike core and is maximized when high-degree vertices are connected to other high-degree vertices.
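The $s$ metric is a one-line sum once the degrees are known. The sketch below assumes the graph is given as a list of undirected links; the function name `s_metric` and the two example trees are illustrative and do not come from the paper. Both trees share the degree sequence $\{3,2,2,1,1,1\}$ but differ in whether the two degree-2 vertices attach to the hub.

```python
def s_metric(edges):
    """Sum of d_i * d_j over all links (i, j); degrees are read off the edge list."""
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return sum(degree[i] * degree[j] for i, j in edges)


# Hypothetical example: two trees with the same degree sequence {3, 2, 2, 1, 1, 1}.
tree_a = [("a", "b"), ("a", "c"), ("a", "x"), ("b", "y"), ("c", "z")]  # both degree-2 nodes touch the hub
tree_b = [("a", "b"), ("a", "x"), ("a", "y"), ("b", "c"), ("c", "z")]  # one degree-2 node hangs off the other

print(s_metric(tree_a), s_metric(tree_b))  # 19 18
```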

In general, the set $\mathcal{G}(D)$ will have many elements exhibiting a range of $s$ values. Within this space, we define the $s_{\max}$ and $s_{\min}$ graphs within $\mathcal{G}(D)$ as those having the maximum and minimum $s(g)$ values, respectively. To facilitate the derivation of these values, we introduce the vector

$$z = \{\underbrace{d_1, \ldots, d_1}_{d_1\ \text{elements}},\ \underbrace{d_2, \ldots, d_2}_{d_2\ \text{elements}},\ \ldots,\ \underbrace{d_n, \ldots, d_n}_{d_n\ \text{elements}}\}, \qquad (2)$$

having $\sum_{i=1}^{n} d_i$ elements in total, which is simply derived from the original degree sequence $D$. The $s_{\max}$ and $s_{\min}$ values within $\mathcal{G}(D)$ can be described in terms of $z$ in the following manner. Since $\mathcal{G}(D)$ only requires its elements to satisfy the degree sequence $D$ (and ignores issues such as connectedness, multiple edges, etc.) it is easy to show that within $\mathcal{G}(D)$ one has

$$s_{\max} \le z^{T} z / 2, \qquad (3)$$

with equality achieved in practice only under certain circumstances (e.g., when the elements of $D$ are all even or there is an even number of elements having any particular odd value). This observation follows from the rearrangement inequality [19], which states that if $a_1 \ge a_2 \ge \cdots \ge a_n$ and $b_1 \ge b_2 \ge \cdots \ge b_n$ then for any permutation $(a'_1, a'_2, \ldots, a'_n)$ of $(a_1, a_2, \ldots, a_n)$, we have

$$a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \ \ge\ a'_1 b_1 + a'_2 b_2 + \cdots + a'_n b_n \ \ge\ a_n b_1 + a_{n-1} b_2 + \cdots + a_1 b_n.$$

Accordingly, it follows that

$$s_{\min} \ge z^{T} \bar{z} / 2, \qquad (4)$$

where $\bar{z}$ is simply the vector $z$ with elements in reverse order. However, unlike the case in (3) where equality is achieved in practice only sometimes and where the actual $s_{\max}$ value may deviate considerably from the upper bound, the relationship in (4) holds with approximate equality and typically the $s_{\min}$ value deviates from the lower bound by only a single pair of edges, if at all.

It is easy to see that the $s_{\max}$ value can be rewritten as

$$s_{\max} = \sum_{i=1}^{n} (d_i/2)\, d_i^{2} = \sum_{i=1}^{n} d_i^{3}/2, \qquad (5)$$

which is achieved in effect by creating primarily self-loops among the vertices in the network and then connecting the remaining stubs in order of decreasing $d_i$ (see Appendix A of [18] for details). To the best of our knowledge, there does not exist a comparable analytic formula (or interpretation) for the $s_{\min}$ graph in $\mathcal{G}(D)$.

Many graphs of practical interest have additional conditions imposed by functional or domain constraints, such as a requirement to be connected or a restriction against self-loops or multiple connections. Thus, in our investigation we also consider the restricted set of all simple and connected graphs having the same degree sequence $D$, which we denote as $G(D)$. Note that $G(D) \subset \mathcal{G}(D)$ and that most randomly generated graphs with particular $D$ will be neither simple nor connected, so this is an important and nontrivial restriction. From these definitions it follows that

$$s_{\min}^{\mathcal{G}(D)} \le s_{\min}^{G(D)} \le s_{\max}^{G(D)} \le s_{\max}^{\mathcal{G}(D)}.$$

Although bounding values for the minimum and maximum elements of $\mathcal{G}(D)$ can be directly obtained from Eqs. (3) and (4), obtaining $s_{\max}$ and $s_{\min}$ values within the restricted space $G(D)$ is more complicated.

Given a particular degree sequence $D$, it is possible to use a deterministic procedure in order to construct the $s_{\max}$ graph in $G(D)$. The details of this construction procedure are presented in [18], but the basic idea is to order all potential links $(i,j)$ for all $i,j \in \mathcal{V}$ according to their weights $d_i d_j$ and then add them one at a time in a manner that results in a simple, connected graph having degree sequence $D$. While simple enough in concept, this type of "greedy" heuristic procedure may have difficulty achieving the intended sequence $D$ due to the global constraints imposed by connectivity requirements, but it works well in practice for most graphs (again, see [18] for details). Obtaining the $s_{\min}$ value is less exact, and it is easy to show that the $s_{\min}$ graph is not unique. Whitney and Alderson [20] have recently used a heuristic approach, originally proposed by Maslov and Sneppen [21], which employs a Metropolis-like algorithm based on successive rewiring to obtain $s_{\min}$ values within $G(D)$. Unfortunately, this method is inefficient and does not reliably obtain the actual $s_{\min}$ value. However, in practice one finds that $s_{\min}^{\mathcal{G}(D)} \approx s_{\min}^{G(D)}$, so in the remainder of this paper we use the $s_{\min}$ value defined in (4) as an approximate (and more conservative) bounding value for $s_{\min}^{G(D)}$.

As a measure of graph structure, the $s$ metric provides a simple means for contrasting the differences between graphs having the same degree sequence, and in this paper we use it exclusively as a means for measuring the diversity within a particular space of graphs. In particular, the extreme points $s_{\max}$ and $s_{\min}$ serve as meaningful reference points for individual graphs and the space as a whole, and for a given $D$ the difference $s_{\max} - s_{\min}$ provides a measure of how different the absolute extremes are. Using this perspective, it is not hard to see that the amount of diversity for graphs having a particular $D$ is related to the amount of variability within the sequence $D$ itself. Here, we characterize variability with the standard measure of (sample) coefficient of variation ($C_V$), which for a given sequence $D = \{d_1, d_2, \ldots, d_n\}$ is defined as

$$C_V(D) = \sigma(D)/\langle d \rangle, \qquad (6)$$

where $\langle d \rangle = n^{-1}\sum_{i=1}^{n} d_i$ is the average vertex degree, and we measure deviations of the $d_i$ from their average $\langle d \rangle$ using the sample standard deviation $\sigma(D) = \left[\sum_{i=1}^{n} (d_i - \langle d \rangle)^2/(n-1)\right]^{1/2}$.

For graphs with regular structure that have low variability in their degree sequence $D$, there is typically very little diversity in the corresponding space of graphs $G(D)$. Consider, as an extreme example, a one-dimensional lattice (i.e., a chain) with the degree sequence $D_{\text{chain}} = \{2, 2, 2, \ldots, 2, 1, 1\}$. One can easily show that for a chain consisting of $n$ nodes

$$C_V(D_{\text{chain}}) = \frac{[n(n-2)]^{1/2}}{\sqrt{2}\,(n-1)^{3/2}},$$

and thus $C_V(D_{\text{chain}}) \to 0$ as $n \to \infty$. It is easy to see that there is no diversity among graphs having degree sequence $D_{\text{chain}}$, since all $n$-node chains are isomorphic to one another in $G(D)$ and thus $s_{\min} = s_{\max}$.

For sequences $D$ with increasing $C_V(D)$, graph diversity as measured by the range $s_{\max} - s_{\min}$ also increases. Here, we leverage two classes of graphs as reference points. For graphs with a degree sequence having an exponential form with constant $c > 0$ (denoted here as $D_{\exp}$), one observes that $C_V(D_{\exp})$ approaches a constant as $n \to \infty$. In contrast, the scale-free graphs [22] (so called because their degree sequences exhibit a scaling relationship of the form $k\, d_k^{\alpha} = c$, for all $1 \le k \le n_s \le n$, where $c > 0$ and $\alpha > 0$ are constants, and where $n_s$ determines the range of scaling [23]) exhibit divergent $C_V$. It is easy to show that degree sequences $D_{\text{scaling}}$ with $\alpha < 2$ follow $C_V(D_{\text{scaling}}) \to \infty$ as $n \to \infty$. As we will show below, these classes of graphs yield degree sequences with measurably different levels of diversity.

Although one might expect that graph diversity simply increases with $C_V(D)$, this need not be the case. Consider a star consisting of a single central node that connects to all others and having degree sequence $D_{\text{star}} = \{n-1, 1, 1, \ldots, 1\}$. One can similarly show that

$$C_V(D_{\text{star}}) = \frac{n^{1/2}(n-2)}{2(n-1)},$$

and thus $C_V(D_{\text{star}}) \to \infty$ as $n \to \infty$. However, as for the chain, there is no diversity among graphs having degree sequence $D_{\text{star}}$, since all stars are isomorphic to one another in $G(D)$ and $s_{\min} = s_{\max}$.
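The chain and star limits can be checked numerically. The sketch below computes the sample coefficient of variation of Eq. (6) and compares it with closed forms re-derived here from that definition; these closed forms are an assumption about the intended expressions, and the function names are hypothetical. For $n = 100$ the results round to 0.0711 and 4.9495, matching the limits quoted in the Fig. 1 caption.

```python
import math
import statistics


def coefficient_of_variation(degrees):
    """Sample standard deviation (n - 1 denominator) divided by the mean degree."""
    return statistics.stdev(degrees) / statistics.mean(degrees)


def cv_chain(n):
    """Assumed closed form for D_chain = {2, ..., 2, 1, 1}, re-derived from Eq. (6)."""
    return math.sqrt(n * (n - 2)) / (math.sqrt(2) * (n - 1) ** 1.5)


def cv_star(n):
    """Assumed closed form for D_star = {n - 1, 1, ..., 1}, re-derived from Eq. (6)."""
    return math.sqrt(n) * (n - 2) / (2 * (n - 1))


n = 100
chain = [2] * (n - 2) + [1, 1]
star = [n - 1] + [1] * (n - 1)
print(round(coefficient_of_variation(chain), 4), round(cv_chain(n), 4))
print(round(coefficient_of_variation(star), 4), round(cv_star(n), 4))
```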

In order to make the previous discussion more concrete, we now consider a simple experiment to investigate the role of $C_V(D)$ in determining the diversity for graphs having particular $D$. For purposes of exposition, we begin with a study of acyclic graphs (i.e., trees) and then later comment on how our results apply to general graphs. Our experiment uses incremental growth via preferential attachment as described in [24], in which each newly added node connects to an existing node $k$ with probability

$$\Pi(k) = \frac{d_k^{\,p}}{\sum_{j} d_j^{\,p}}, \qquad (7)$$

where $d_k$ is again the degree of node $k$, and $p$ is a parameter that tunes the attachment mechanism. The resulting graph is simple and connected, and thus an element of $G(D)$, although the degree sequence $D$ that is realized will vary from trial to trial. Clearly, $p = 0$ is equivalent to uniform attachment (resulting in $D_{\exp}$), while $p = 1$ is equivalent to linear preferential attachment used in the Barabási-Albert model [3] (resulting in $D_{\text{scaling}}$). A similar type of model was also considered in [25]. Note also that as $p \to \infty$ each newly added node attaches to the maximum degree node (resulting essentially in $D_{\text{star}}$), while as $p \to -\infty$ each newly added node attaches to the minimum degree node (resulting essentially in $D_{\text{chain}}$). In what follows, we first restrict attention to the case where $b = 1$ (i.e., we generate acyclic graphs) and consider a range of values for $p$ in order to generate graphs having a variety of degree sequences. We defer results on general graphs until the end.
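A minimal sketch of this growth process follows, assuming the attachment probability is proportional to $d_k^{p}$ and that growth starts from a single link between two nodes. The seed graph, the function name `grow_tree`, and the `seed` argument are illustrative choices, not details taken from [24].

```python
import random


def grow_tree(n, p, seed=None):
    """Grow an n-node tree; each new node links to existing node k with weight d_k ** p."""
    rng = random.Random(seed)
    edges = [(0, 1)]   # assumed seed graph: one link between two nodes
    degree = [1, 1]
    for new in range(2, n):
        weights = [d ** p for d in degree]  # d >= 1, so negative p is safe
        target = rng.choices(range(new), weights=weights, k=1)[0]
        edges.append((target, new))
        degree[target] += 1
        degree.append(1)
    return edges, sorted(degree, reverse=True)


edges, D = grow_tree(100, p=1.0, seed=0)   # linear preferential attachment
print(len(edges), D[:5])
```

Very large $|p|$ can overflow floating-point weights, so the limiting chain and star are better built directly than approached through this routine.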

Figure 1 shows the result of an experiment in which for each trial we generate a tree having $n = 100$ nodes using the preferential attachment rule given by (7). That is, each trial results in a tree having its own degree sequence $D$ and $s$ value. In generating these graphs, we use various attachment exponents $p$, but only for the purpose of realizing graphs with a diversity of degree sequences. In what follows we focus primarily on the degree sequence $D$ and the constraints it places on the space of graphs, not the attachment exponent $p$ that led to $D$. For each degree sequence $D$, we then calculate $C_V(D)$ as well as the corresponding $s_{\max}$ and $s_{\min}$ values as described above. The resulting picture in Fig. 1(a) shows a striking relationship between $C_V(D)$ and the range of possible $s$ values. One observes that, while the $s_{\max}$ and $s_{\min}$ values increase with $C_V(D)$ for both the unconstrained space $\mathcal{G}(D)$ and the constrained space $G(D)$, the differences given by $s_{\max} - s_{\min}$ for each space behave differently at the maximal values of $C_V(D)$. Specifically, this difference within the unconstrained space $\mathcal{G}(D)$ increases with $C_V(D)$, but it is zero at both extremes of $C_V(D)$ for the simple, connected graphs in $G(D)$ (again, the limiting cases of a chain and a star). It is also worth noting that the values for $s_{\min}^{\mathcal{G}(D)}$ and $s_{\min}^{G(D)}$ are so close as to be indistinguishable, further supporting our choice to treat these values as equivalent. Figure 1(b) presents the same information for $s_{\max}$ and $s_{\min}$ within $G(D)$, but normalizes the values for each graph against its respective $s_{\max}$ value, thus resulting in a feasible range [0, 1] for each graph. Collectively, this suggests that for a given degree sequence one needs enough variability to enable diversity among simple, connected graphs but that too much variability actually becomes a constraint within the space $G(D)$, something that Maslov et al. [26] have described as essentially a finite-size effect.

FIG. 1. (Color online) Three views of graph diversity. In this experiment trees of size $n = 100$ were generated according to attachment rule (7) for different values of $p$. (a) For each resulting tree, we plot $s_{\max}$ and $s_{\min}$ values in both $\mathcal{G}(D)$ and $G(D)$ versus the $C_V(D)$ of the corresponding degree sequence. Note that $s_{\min}^{\mathcal{G}(D)} \approx s_{\min}^{G(D)}$. (b) $s_{\min}$ and $s_{\max}$ in $G(D)$, each normalized by their respective $s_{\max}$. (c) The corresponding $r_{\min}$ and $r_{\max}$ values in $G(D)$. In all cases, the vertical lines correspond to the upper and lower limits of $C_V$ for an acyclic graph having 100 nodes [i.e., $C_V(D_{\text{chain}}) = 0.0711$ and $C_V(D_{\text{star}}) = 4.9495$].

Although it is now well understood that there can be many graphs having the same degree sequence and that these graphs may have considerable structural differences, quantifying these differences and their implications in terms of real systems remains the topic of active research. Previous work by Li et al. [18] has shown that the $s$ metric, and in particular the $s_{\max}$ graph within $G(D)$, is relevant for many commonly studied graph properties. First, high-degree nodes in the $s_{\max}$ graph have high centrality, and for trees this relationship was shown to be monotonic. Second, $s_{\max}$ graphs are self-similar under appropriately defined operations of trimming and coarse graining. Finally, the $s_{\max}$ graph has the highest likelihood of being generated by the generalized random graph (GRG) model [15]. As already noted, other work by Li et al. [10] has shown that, in modeling the router-level Internet, the observed degree sequences in real networks allow for dramatic diversity in candidate models, particularly when measured in terms of throughput performance. A previously unanswered question was whether this diversity is inherent in all networks, and here we have shown that it depends to some extent on the degree sequence of the network in question.

Taken by itself, this observation is neither groundbreaking nor surprising. For some time, there has been a general recognition in the literature that the degree sequence of a graph can provide only a simplistic characterization of its properties, and this has led many researchers to consider more sophisticated descriptions of graph structure. Most notable has been an emphasis on various forms of correlation in network connectivity, ranging from simple notions of network clustering (i.e., connectivity correlations between vertex triplets) to more general degree-degree correlations [also called the joint degree distribution (JDD)] and spectral methods. There is now a growing literature on the importance of correlation structure in networks [2,27-31] and how to generate networks having particular correlation structure [25,32-34]. A simple measure of correlation structure that has appeared extensively in the literature is the Pearson coefficient $r$ (known more generally as the correlation coefficient [35]), which is used to quantify the average tendency of vertices to connect to others having similar degree. It turns out that there is an inherent relationship between the Pearson coefficient and the $s$ metric, and a closer look at this relationship yields considerable insight into both the diversity within the background set $G(D)$ and the interpretation of $r$ itself.

GRAPH ASSORTATIVITY RECONSIDERED

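Recently, Newman [36] introduced the following sample-based measure of graph assortativity as defined by the Pearson coefficient:

$$r(g) = \frac{\left[\sum_{(i,j)\in\mathcal{E}} d_i d_j / l\right] - \left[\sum_{(i,j)\in\mathcal{E}} \tfrac{1}{2}(d_i + d_j)/l\right]^2}{\left[\sum_{(i,j)\in\mathcal{E}} \tfrac{1}{2}(d_i^2 + d_j^2)/l\right] - \left[\sum_{(i,j)\in\mathcal{E}} \tfrac{1}{2}(d_i + d_j)/l\right]^2}. \qquad (8)$$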
This relationship can be written as
$$r(g) = \frac{\sum_{(i,j)\in\mathcal{E}} d_i d_j - \left(\sum_{i\in\mathcal{V}} \tfrac{1}{2} d_i^2\right)^2 / l}{\sum_{i\in\mathcal{V}} \tfrac{1}{2} d_i^3 - \left(\sum_{i\in\mathcal{V}} \tfrac{1}{2} d_i^2\right)^2 / l}, \qquad (9)$$

where the first term of the numerator is exactly $s(g)$. Although the Pearson coefficient is only a summary statistic for the correlation profile of the graph as a whole, it provides interesting information nonetheless and is often cited as a key feature distinguishing various classes of complex networks [4,27,36,37].

Here, we argue that $r(g)$ has a natural interpretation as a centered and normalized version of $s(g)$. In particular, observe that the first term of the denominator in (9) is exactly the $s_{\max}$ value within the space $\mathcal{G}(D)$ as defined in (5). Accordingly, one can rewrite the Pearson coefficient as

$$r(g) = \frac{s(g) - s(g_c)}{s_{\max} - s(g_c)}, \qquad (10)$$

where we refer to $g_c$ as the center of the space $\mathcal{G}(D)$.

FIG. 2. Four graphs with the same degree sequence but increasing values of $s$. As originally presented in [18], these networks have the same (power-law) degree distribution, but here the degree-1 nodes have been omitted. The label on each node indicates its total degree. The degree sequence for these graphs yields $s_{\min} = 28\,826$ and $s_{\max} = 77\,350$ within $G(D)$. Panel values: (a) $s = 29876$, $s/s_{\max} = 0.386$, $S = 0.022$, $r = -0.4815$; (b) $s = 33959$, $s/s_{\max} = 0.439$, $S = 0.106$, $r = -0.4766$; (c) $s = 60271$, $s/s_{\max} = 0.779$, $S = 0.648$, $r = -0.4449$; (d) $s = 74010$, $s/s_{\max} = 0.957$, $S = 0.931$, $r = -0.4283$.
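The equivalence between the edge-averaged form of the Pearson coefficient and its centered, normalized form in terms of $s$ can be checked on a small example. The sketch below assumes the centering term is $(\sum_i d_i^2/2)^2/l$ and the normalizing term is $\sum_i d_i^3/2$; the function names and the two example trees (sharing the degree sequence $\{3,2,2,1,1,1\}$) are hypothetical.

```python
def degrees_from_edges(edges):
    """Count how many links touch each vertex."""
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return degree


def pearson_edge_form(edges):
    """Assortativity from averages over the edge list, as in Eq. (8)."""
    deg, l = degrees_from_edges(edges), len(edges)
    mean_prod = sum(deg[i] * deg[j] for i, j in edges) / l
    mean_half_sum = sum(0.5 * (deg[i] + deg[j]) for i, j in edges) / l
    mean_half_sq = sum(0.5 * (deg[i] ** 2 + deg[j] ** 2) for i, j in edges) / l
    return (mean_prod - mean_half_sum ** 2) / (mean_half_sq - mean_half_sum ** 2)


def pearson_s_form(edges):
    """Assortativity as (s - s_center) / (s_max - s_center), as in Eq. (10)."""
    deg, l = degrees_from_edges(edges), len(edges)
    s = sum(deg[i] * deg[j] for i, j in edges)
    s_center = (sum(d ** 2 for d in deg.values()) / 2) ** 2 / l
    s_max = sum(d ** 3 for d in deg.values()) / 2   # unconstrained upper bound
    return (s - s_center) / (s_max - s_center)


tree_a = [("a", "b"), ("a", "c"), ("a", "x"), ("b", "y"), ("c", "z")]
tree_b = [("a", "b"), ("a", "x"), ("a", "y"), ("b", "c"), ("c", "z")]
for tree in (tree_a, tree_b):
    print(pearson_edge_form(tree), pearson_s_form(tree))
```

Both forms divide by zero for regular graphs, where every degree is equal and assortativity is undefined.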

The reason that $g_c$ can be viewed as the center of this space of graphs is discussed in the online supplement to our previous work [18]. The key idea is that a deterministic graph in $G(D)$ with zero assortativity has exactly the same $s$ value as $s(g_c)$, equal to $\left(\sum_{i\in\mathcal{V}} d_i^2/2\right)^2/l$. More specifically, constructing such a deterministic graph with zero assortativity means connecting a vertex to any other vertex in a manner that is proportional to each vertex's degree. This can be realized using a pseudograph $g_A$ in which the elements of the adjacency matrix $A = [a_{ij}]$ are non-negative real numbers representing link weights and satisfying

$$a_{ij} = \frac{d_i d_j}{\sum_{k\in\mathcal{V}} d_k} = a_{ji}.$$

By extension, the $s$ metric for the pseudograph $g_A$ is calculated as

$$s(g_A) = \frac{1}{2}\sum_{j\in\mathcal{V}}\sum_{i\in\mathcal{V}} d_i a_{ij} d_j = \left(\sum_{i\in\mathcal{V}} d_i^2/2\right)^2 / l,$$

showing that $s(g_A) = s(g_c)$. Note that the GRG method [15] can be interpreted as a stochastic procedure that generates real graphs from the zero-assortativity pseudograph $g_A$, with the one important difference that the GRG method always results in simple (but not necessarily connected) graphs. It has recently been shown that the statistical ensemble of graphs resulting from the stochastic GRG method has zero assortativity [39].

Thus, the Pearson coefficient $r$ (as a summary statistic of graph assortativity) captures a fundamental feature of graph structure, one that is closely related to our $s$ metric.¹ That $r$ reflects $s$ is obvious from its definition, but the question is whether a consideration of $s$ by itself provides insight. The key observation is that the existing notion of assortativity for an individual graph $g$ is implicitly measured against a background set of graphs $\mathcal{G}(D)$ that is not constrained to be either simple or connected. As we show next, because $r$ is computed relative to an unconstrained background set, in some cases this normalization (against the unconstrained $s_{\max}$ graph) and centering (against the $g_A$ pseudograph) does a relatively poor job of distinguishing among graphs having the same degree sequence, particularly when that degree sequence exhibits high variability. Figure 2 shows four graphs having the same degree sequence, but with very different connectivity patterns. These graphs were originally constructed as contrasting representations of the router-level Internet (see [18], Fig. 5), but are presented here in a manner that highlights their diversity. Specifically, one observes that although they have nearly the same assortativity as defined by $r$, their structural differences are highlighted by $s$ and its normalized values $s/s_{\max}$ and $S(g)$, defined as

$$S(g) = \frac{s(g) - s_{\min}}{s_{\max} - s_{\min}}. \qquad (11)$$

In cases where network performance is measured by the maximum throughput under fixed node capacities, these structural differences translate to big differences in performance [18].

¹Indeed, the Pearson coefficient is typically viewed as simply the correlation coefficient of the joint distribution $P(k,k')$ that a randomly selected link in the network will connect vertices having degree values $k$ and $k'$. In this context, the "centering term" is simply the squared average of the marginal distribution of $P(k,k')$, and the denominator of (8) is the square of the standard deviation.

TABLE I. Sensitivity of assortativity among graphs having low $C_V(D)$. Each graph shown has $n$ nodes and $n-1$ links. In the limit where $n \to \infty$, minimal differences in graph structure, as measured by $C_V(D)$ and the ratio $s/s_{\max}$, translate to large differences as measured by the Pearson coefficient $r$.
[Table body (graph drawings with limiting values of $C_V(D)$, $s/s_{\max}$, and $r$) not recoverable from the source.]
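The normalization in (11) is a linear rescaling of $s$ onto [0, 1], which can be checked against the panel values reported for Fig. 2. The sketch below treats $s_{\min} = 28826$ and $s_{\max} = 77350$ as assumed readings of the figure caption (they are not recomputed from the graphs), and the function name `normalized_s` is hypothetical.

```python
def normalized_s(s, s_min, s_max):
    """Position of s within [s_min, s_max], scaled to [0, 1], as in Eq. (11)."""
    return (s - s_min) / (s_max - s_min)


# Assumed readings of the Fig. 2 caption; not re-derived from the graphs themselves.
S_MIN, S_MAX = 28826, 77350
panels = {"a": 29876, "b": 33959, "c": 60271, "d": 74010}

for name, s in panels.items():
    print(name, round(s / S_MAX, 3), round(normalized_s(s, S_MIN, S_MAX), 3))
# a 0.386 0.022
# b 0.439 0.106
# c 0.779 0.648
# d 0.957 0.931
```

Across the four panels $S$ spans most of [0, 1] while the reported $r$ changes by only about 0.05, which is the contrast the text draws.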

For additional insight into the way in which differences in s translate to
differences in r, we extend the previous computational experiment to values
of $r_{\max}$ and $r_{\min}$ within the constrained background set G(D).
Note that these values can be computed directly from the corresponding
values of $s_{\max}$ and $s_{\min}$. In Fig. 1(c) we show these values for
each of the generated graphs in our experiment. There are several striking
features of this plot. The first is that the normalization of the s metric
in the calculation of the Pearson coefficient r dramatically changes the
sense of graph diversity among graphs having a particular D. For values of
relatively high CV(D), r<0 and seems largely independent of any diversity
as measured by the range in allowable s. In other words, a second important
conclusion is that all networks with high CV(D) have r < 0 and this seems
largely a function of D and not any particular feature of the graph or
whether it is a technological or social network as argued in [37]. This
point has been made previously in [7,26,29,33,38] and has also been
recently argued [20] based largely on empirical observations of real
networks having a range of r values. A third important result is that for
small values of CV(D) one observes that small diversity as measured by
$s_{\max} - s_{\min}$ translates to a large range of r. One can see this
more clearly with the simple example in Table I, which illustrates the
sensitivity of r to small changes in topology. Thus, for graphs that are
simple and connected, the Pearson coefficient r can both hide structural
diversity and display false diversity.
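
To make this sensitivity concrete, the sketch below computes the degree correlation coefficient directly from an edge list, using the standard link-based form of the Pearson coefficient (assumed here to coincide with (8)). The function names are illustrative. For a chain on n vertices this form reduces to r = -1/(n-2), which tends to 0 as n grows, whereas a star has r = -1 for every n.

```python
import math
from collections import Counter

def pearson_r(edges):
    """Degree-degree Pearson coefficient over the links of a graph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    m = len(edges)
    mean_prod = sum(deg[u] * deg[v] for u, v in edges) / m
    mean_half_sum = sum(0.5 * (deg[u] + deg[v]) for u, v in edges) / m
    mean_half_sq = sum(0.5 * (deg[u] ** 2 + deg[v] ** 2) for u, v in edges) / m
    denom = mean_half_sq - mean_half_sum ** 2
    if math.isclose(denom, 0.0, abs_tol=1e-12):
        return float("nan")  # every link end point has the same degree
    return (mean_prod - mean_half_sum ** 2) / denom

def chain(n):
    """Path graph on n vertices."""
    return [(i, i + 1) for i in range(n - 1)]

def star(n):
    """Star graph on n vertices with hub 0."""
    return [(0, i) for i in range(1, n)]

for n in (4, 12, 102):
    print(n, pearson_r(chain(n)), pearson_r(star(n)))
```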

It is worth noting that although r(g) = 1 is achieved approximately by the
$s_{\max}$ graph within $\mathcal{G}(D)$ for all graphical D, it is only
in very special instances of D where the
graph  is  obtained.  Specifically,  when  ,  then  it  fol¬ 

lows  that  r{g)  =  -l  if  and  only  if  Zk-^Zk=z  (a  constant)  for 
each  of  the  k  pairs  of  elements.  In  other  words,  although  it  is 


true  that  for  arbitrary  D,  one  often  observes  that 

simply  because  of  the  degree  sequence  D  itself.  A 
proof  of  this  appears  in  the  Appendix. 

Based on this analysis, one might reasonably conclude that the Pearson
coefficient r is not a suitable metric for comparing the correlation
structure of graphs from different domains. Indeed, it is well understood
that a more accurate approach is to consider higher-order forms of
correlation. Yet the deeper question relates to how one should evaluate any
observed correlation structure. Recent efforts by several authors have
warned against graph theoretic analysis of networks in isolation. For
example, Maslov et al. [21,26] have argued that a real assessment of a
network's correlation structure makes sense only when compared against its
randomized counterpart. In the context of rich-club ordering in complex
networks (i.e., the tendency of high-degree vertices to connect to one
another), Colizza et al. [40] have also argued that the presence of
high-degree vertices in a given network is enough to ensure that
high-degree vertices are connected, and they similarly argue for the need
to compare the features of any subject network to a randomized baseline.
Thus, important questions include: What is the appropriate baseline against
which to compare graphs? How does this relate to the background set of
graphs, as defined by G(D) or $\mathcal{G}(D)$?


MEASURING  AGAINST  BACKGROUND  SETS 

The previous sections provide enhanced understanding of the way in which a
given D constrains the possible s and r values a graph can have, and they
also suggest that when making statements about a graph based on these graph
properties one must consider the background set against which these
properties are being evaluated. Here, we expand this viewpoint by
considering the way in which a graph with given D compares within the space
of graphs bounded by $s_{\min}$ and $s_{\max}$ values. We furthermore
consider the location of randomized graphs within this space.

As above, our approach here is largely empirical, and we again leverage our
previous numerical experiment in generating graphs via incremental growth
according to an attachment exponent p. For a given value of p, we generate
a graph having n vertices with resulting degree sequence D. Then, for that
particular D we construct the $s_{\min}$ and $s_{\max}$ graphs within G(D).
We also compute the theoretical upper bound (3) and lower bound (4) on s
within $\mathcal{G}(D)$. We then obtain appropriately randomized graphs
having degree sequence D in two ways. First, we generate m = 500 new graphs
according to the configuration method. Also, we consider the process of
degree-preserving rewiring on the original graph.
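
The configuration method can be sketched as random stub matching: each vertex i contributes d_i stubs, and stubs are paired uniformly at random. This is a minimal illustration under that standard formulation, not necessarily the exact procedure used in the experiments; the function name, the example degree sequence, and the seed handling are assumptions. The result may contain self-loops and parallel links, and it need not be connected.

```python
import random
from collections import Counter

def configuration_graph(degree_sequence, seed=None):
    """Pair stubs uniformly at random; returns a list of links (u, v)."""
    if sum(degree_sequence) % 2 != 0:
        raise ValueError("sum of degrees must be even")
    rng = random.Random(seed)
    # Vertex v appears once in the stub list for each unit of degree.
    stubs = [v for v, d in enumerate(degree_sequence) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list form the links.
    return list(zip(stubs[0::2], stubs[1::2]))

D = [4, 3, 2, 2, 1, 1, 1]          # hypothetical degree sequence
edges = configuration_graph(D, seed=1)

# A self-loop (u, u) uses two stubs of u, so it counts twice here.
realized = Counter()
for u, v in edges:
    realized[u] += 1
    realized[v] += 1
print(edges)
```

Because only the stub counts are fixed, every realization reproduces D exactly, while membership in the set of simple, connected graphs has to be checked separately.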

Graph rewiring is effective as a conceptual, as well as computational,
means for exploring the space of graphs having the same degree sequence D.
Since exchanging the end points of any two links does not alter the degrees
of the affected vertices (and hence leaves the overall degree sequence
unchanged), this approach has been a popular tool for investigating the
effects of local topological changes on global graph properties [18,21,26]
as well as a means for generating graphs having a specified degree sequence
and additional properties (e.g., connectedness) [41,42]. Here, we consider
degree-preserving rewiring as a means for moving within the space of graphs
having degree sequence D. In previous work [18], we have used the number of
successive rewiring steps between two graphs as a measure of distance in
the space G(D); however, in this study we restrict attention to the
distribution of s values within the possible range $s_{\max} - s_{\min}$
for both G(D) and $\mathcal{G}(D)$. In the aforementioned extreme examples
of a chain and star, any degree-preserving rewiring operation that
precludes disconnection or self-loops yields a graph that is isomorphic to
the original, and again shows that there is no diversity in either case.
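
A degree-preserving rewiring step restricted to simple, connected graphs can be sketched as follows. This is an illustrative implementation under stated assumptions (an edge-list input for a simple connected graph, a breadth-first connectivity check, hypothetical function names), not the authors' code. Proposals that would create a self-loop, a parallel link, or a disconnected graph are rejected.

```python
import random
from collections import defaultdict, deque

def is_connected(edges):
    """Breadth-first search over the vertices that appear in `edges`."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return len(seen) == len(adj)

def rewire(edges, attempts, seed=None):
    """Attempt `attempts` end-point swaps; rejected proposals change nothing."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(attempts):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c              # consider both ways of pairing end points
        new1, new2 = (a, d), (c, b)
        if a == d or c == b:
            continue                 # would create a self-loop
        if frozenset(new1) in present or frozenset(new2) in present:
            continue                 # would create a parallel link
        trial = list(edges)
        trial[i], trial[j] = new1, new2
        if not is_connected(trial):
            continue                 # would disconnect the graph
        present -= {frozenset(edges[i]), frozenset(edges[j])}
        present |= {frozenset(new1), frozenset(new2)}
        edges = trial
    return edges

star = [(0, i) for i in range(1, 6)]
chain = [(i, i + 1) for i in range(6)]
star_after = rewire(star, 200, seed=0)
chain_after = rewire(chain, 200, seed=0)
```

For the star every proposal is rejected, so the link set never changes; for the chain, accepted swaps only relabel the path, consistent with the absence of diversity noted above.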

Figure 3 shows the results of three representative numerical experiments
exploring the distribution of graphs having particular s values for a
specified D. Figure 3(a) resulted from uniform attachment (i.e., p = 0) and
corresponds to the case of $D_{\mathrm{exp}}$ having low variation {here,
CV(D) = 0.6380 within the possible range [0.0711,4.9495] for acyclic graphs
having n = 100 nodes}. Figure 3(b) resulted from linear preferential
attachment (p ≈ 1) and corresponds to the case of $D_{\mathrm{scaling}}$
[here, CV(D) = 1.4121]. Figure 3(c) resulted from superlinear preferential
attachment (i.e., p > 1) and corresponds to a case with high variability
[here, CV(D) = 2.5141]. For each case, the $s_{\max}$ graph within G(D) was
obtained by the construction mechanism described previously, while the
$s_{\min}$ value was obtained from (4). The leftmost graph for each case
corresponds to an approximate $s_{\min}$ graph obtained heuristically. From
these results, several observations are immediately clear.

(1) For each particular D, there are considerable differences between the
$s_{\min}$ and $s_{\max}$ graphs. In all cases, the $s_{\min}$ graph looks
very chainlike and the $s_{\max}$ graph looks very starlike.

(2) The range of feasible s values for graphs in $\mathcal{G}(D)$ is
considerably larger than the range for G(D), and this difference increases
with greater CV(D).

(3) The differences between the graphs in each case are less obvious when
evaluated using the Pearson coefficient r [normalized against the graphs in
the unconstrained space $\mathcal{G}(D)$] but are emphasized when evaluated
using normalized s values (i.e., either $s/s_{\max}$ or S). Thus, when
comparing among elements of G(D), the Pearson coefficient can hide the
structural differences rather than highlight them. Similar observations
were made previously in [18,20].


(4) Although rewiring within the space G(D) yields a distribution of graphs
that theoretically span the entire space, using rewiring to obtain graphs
having extreme s values is difficult to achieve in practice. The
implications for using rewiring as a means to obtain an ensemble of graphs
are unclear. Moreover, it is unclear what, if anything, one can say about
the original graph for each case based on its placement within the feasible
range of graphs for G(D).

(5) As expected, there is good correspondence in all cases between the
distribution of graphs resulting from rewiring in the unconstrained space
$\mathcal{G}(D)$ and those generated from the configuration method.
Furthermore, the distribution of these graphs appears largely centered on
r = 0, as would be predicted since it was shown that the CM approach
results in zero-assortativity graphs (in expectation).

(6) The distribution of graphs in $\mathcal{G}(D)$ is consistently shifted
toward larger s values than those in G(D). As CV increases, the differences
between the distributions of graphs in G(D) and $\mathcal{G}(D)$ become
more extreme, to the point where all of the graphs generated within
$\mathcal{G}(D)$ have s values larger than can be achieved by the
$s_{\max}$ graph of G(D). In other words, for large-CV degree sequences,
none of the graphs generated by the CM or resulting from rewiring within
$\mathcal{G}(D)$ correspond to simple, connected graphs [i.e., elements in
G(D)].

In practice, for graphs having high CV that are simple and connected, we
advocate the use of $s/s_{\max}$ or S as measures of diversity. For graphs
that are not simple or connected, the Pearson coefficient r provides
insight into the diversity within $\mathcal{G}(D)$.

These  observations  yield  several  important  conclusions. 

First, graphs that arise from different contexts may not be directly
comparable using structural metrics that are inherently computed against
different background sets. In considering the above examples, one observes
that the approximate $s_{\min}$ graph in Fig. 3(b) [i.e., CV(D) = 1.41]
translates to r = -0.45 while the $s_{\max}$ graph for Fig. 3(c) [i.e.,
CV(D) = 2.51] translates to r = -0.43. A naive look at the Pearson
coefficient suggests that they are similarly assortative, although the
graph in Fig. 3(b) has the minimal r value and the graph in Fig. 3(c) has
the maximal r value for their respective degree sequences.

Second, the differences between the unconstrained space $\mathcal{G}(D)$
and the space of simple, connected graphs G(D) may be more important in
determining graph properties than other features as measured by aggregate
statistics. Specifically, the use of graph generation techniques such as
the configuration method, even if they replicate the measured degree
sequence of a real network, may be entirely inappropriate if the domain
under study requires simple and connected graphs. This strengthens previous
results on the importance of these additional restrictions as reported in
[26,43].

Third, while it is clear that the evaluation of a graph based on its
structural properties may be appropriate only in relation to the
corresponding background set, understanding the implication of those
structural features (e.g., in terms of function) remains an open question.
For example, it remains unclear what, if anything, the relative placement
of a graph within the range $[s_{\min}, s_{\max}]$ actually says about the
graph itself.
FIG. 3. (Color online) Diversity among graphs having the same degree
sequence: (a) uniform attachment (p = 0), (b) approximate linear attachment
(p ≈ 1), (c) superlinear attachment (p > 1). In each case, a single graph
with n = 100 vertices was generated using a different preferential
attachment exponent and results in a different degree sequence D. The
corresponding $s_{\min}$ and $s_{\max}$ graphs were also obtained for both
G(D) and $\mathcal{G}(D)$. Each node is labeled with its degree, with
degree-1 nodes omitted for simplicity. Also shown for each is the
distribution of graphs within the space G(D) (from rewiring) and within
$\mathcal{G}(D)$ (from rewiring and generated via the CM).


DISCUSSION 

An inherent challenge in the study of graph diversity is that the
combinatorics of even relatively small networks typically result in a space
of graphs that is incredibly large. In this study, we have focused on
graphs having n = 100 (which are about the largest that can be visualized
easily) for purposes of exposition, and even here a comprehensive analysis
of the elements in G(D) and $\mathcal{G}(D)$ is challenging. In choosing
preferential attachment as our primary means for graph generation, we have
tried to keep our methods closely tied to the literature so that they may
be easily replicated. An alternate approach could have been to identify
specific degree sequences D for which graph isomorphism reduces the number
of unique graphs to a small handful and the entire space of graphs (not
just $s_{\max}$ and $s_{\min}$) is easily visualized. Identifying and
exploring such examples may represent an important step in future work.

The overall message of the results here is that one must carefully consider
the inherent diversity of graphs sharing a particular statistical measure
when making claims based on any such statistic. Nonetheless, additional
work is required to understand fully the way in which graph diversity
affects such characterizations. While others have argued for the need to
compare against a randomized version of the graph, here we have compared
against the entire feasible region, as measured by the range
$[s_{\min}, s_{\max}]$. The examples here seem to suggest that the
distribution of graphs within either G(D) or $\mathcal{G}(D)$ is not
uniform, and a general characterization of these distributions is unknown.
Ideally, one would like to know more about where the randomized graph sits
within the overall space (i.e., is it the center of this space?). Moreover,
there may be important differences between graph properties that are
imposed by structural constraints (e.g., by the degree sequence D) and
those relative to what has been randomized.

FIG. 4. (Color online) Graph diversity among nontrees. In this experiment,
an additional k(n-1) links were added to initial trees of size n = 100.
(a) k = 1, ⟨d⟩ = 3.96, $\mathrm{CV}_{\max}$ = 3.4451. (b) k = 2,
⟨d⟩ = 5.94, $\mathrm{CV}_{\max}$ = 2.7612. (c) k = 4, ⟨d⟩ = 9.9,
$\mathrm{CV}_{\max}$ = 2.0101. In the bottom graphs, variation is measured
with CV(D) while in the top graphs it is represented as the normalized
$\mathrm{CV}(D)/\mathrm{CV}_{\max}(D)$.

Although this study provides additional insight into the way in which graph
diversity affects one's ability to use aggregate statistics for
characterizing complex networks, it has done so primarily for acyclic
graphs (i.e., trees), and more work is required to understand the extent to
which these same results hold for more general network structures. However,
we now present preliminary empirical evidence that suggests the story for
nontrees is qualitatively the same.

In Fig. 4, we show the results of a final experiment in which we again
generate trees having n = 100 nodes according to attachment rule (7) for a
range of exponents p. However, to each tree having an initial l = n-1 links
we then add an additional kl links by choosing end points probabilistically
in correspondence with (7). In this manner, we generate graphs having n
nodes and a degree sequence D satisfying $\sum_i d_i = 2(k+1)(n-1)$ [i.e.,
the average degree is ⟨d⟩ ≈ 2(k+1)]. Empirical evidence [4] suggests that,
for many real networks, ⟨d⟩ < 10. For each degree sequence D, we then
compute the corresponding $s_{\min}$, $s_{\max}$, $r_{\min}$, and
$r_{\max}$ values as was done previously. Figure 4 shows these values
plotted against the variation of D, represented again as CV(D) and also now
normalized as $\mathrm{CV}(D)/\mathrm{CV}_{\max}(D)$ for purposes of
comparison.
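
As a quick arithmetic check of the link counts above, the short sketch below evaluates the average degree 2(k+1)(n-1)/n for the three values of k used in Fig. 4. The function name is illustrative; the formula follows directly from counting two end points per link.

```python
def average_degree(n, k):
    """Average degree after adding k*(n-1) links to a tree on n vertices."""
    links = (k + 1) * (n - 1)   # l = n-1 tree links plus k*l additional links
    return 2 * links / n        # each link contributes two end points

for k in (1, 2, 4):
    print(k, average_degree(100, k))
```

For n = 100 this gives 3.96, 5.94, and 9.9, matching the values of ⟨d⟩ listed for the three panels of Fig. 4.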

One observes for graphs with increasing average degree [⟨d⟩ ≈ 4, 6, 10 in
Figs. 4(a)-4(c), respectively] that CV(D) decreases overall but the
relative shape of the space of graphs within G(D), as defined by the range
$[s_{\min}, s_{\max}]$, remains qualitatively consistent with that of
trees. However, the total variation as measured by the relative distance
$(s_{\max} - s_{\min})/s_{\max}$ decreases with increasing link density. At
the same time, for graphs with increasing link density and having degree
sequence with $\mathrm{CV}_{\max}(D)$, the difference
$s_{\max} - s_{\min}$ is no longer zero in general, indicating inherent
diversity even at higher levels of variation.² Graph assortativity as
measured by the range $[r_{\min}, r_{\max}]$ is also qualitatively the same
as for trees, in that high CV(D) is enough to dictate that r < 0 but
considerable diversity exists for low values of CV(D). Although such
results are not conclusive, we view them as generally supportive of graph
diversity as we have discussed it here.
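
The variation measure CV(D) used throughout can be sketched as follows. The code assumes that CV(D) is the sample standard deviation (with n-1 in the denominator) divided by the mean; this assumption is made here because it reproduces the range [0.0711, 4.9495] quoted for acyclic graphs with n = 100, whose extremes are the chain and the star. The function and variable names are illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(degree_sequence):
    """CV(D) = (sample standard deviation of D) / (mean of D)."""
    return stdev(degree_sequence) / mean(degree_sequence)

n = 100
chain_D = [1, 1] + [2] * (n - 2)    # two end points, n-2 interior vertices
star_D = [n - 1] + [1] * (n - 1)    # one hub, n-1 leaves

print(round(coefficient_of_variation(chain_D), 4))  # 0.0711
print(round(coefficient_of_variation(star_D), 4))   # 4.9495
```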

Finally, while this paper has focused on degree sequences and has used the
s metric to highlight the differences in graphs sharing the same D, we
conjecture that a similar story is apt to apply to other graph metrics
(even higher-order ones like the JDD). A detailed exploration of these
issues for other metrics will be important in the development of new graph
analysis and generation techniques.

² However, when the degree sequence D corresponds to a multistar (e.g.,
double star, triple star), the overall picture in the upper row of Fig. 4
looks the same, except that the $s_{\min}/s_{\max}$ values jump abruptly to
1 at $\mathrm{CV}_{\max}(D)$, since all multistars are isomorphic to one
another in G(D).
APPENDIX

In order to see when a degree sequence D can achieve r(g) = -1, we
introduce a simplified version of the Cauchy-Schwarz-Bunyakovskii
inequality, which states that for any vector $\{b_1, b_2, \ldots, b_n\}$ it
must be that

$$\sum_{i=1}^{n} b_i^2 \ge \frac{1}{n}\Big(\sum_{i=1}^{n} b_i\Big)^2,$$

with the equality holding if and only if $b_1 = b_2 = \cdots = b_n$.

Applying this inequality to a graph with l links, it follows that

$$\sum_{(i,j)\in\mathcal{E}} (d_i + d_j)^2 \ge \frac{1}{l}\Big(\sum_{(i,j)\in\mathcal{E}} (d_i + d_j)\Big)^2.$$

Expanding the squared term on the left-hand side and dividing both sides by
2, we have from relations (8) and (9) that

$$\sum_{(i,j)\in\mathcal{E}} \frac{2 d_i d_j}{2} + \sum_{(i,j)\in\mathcal{E}} \frac{d_i^2 + d_j^2}{2} \ge \frac{1}{2l}\Big(\sum_{(i,j)\in\mathcal{E}} (d_i + d_j)\Big)^2,$$


$$s(g) + s_{\max}^{\mathcal{G}(D)} \ge 2\,s(g_c),$$

$$\frac{s(g) - s(g_c)}{s_{\max}^{\mathcal{G}(D)} - s(g_c)} \ge -1,$$


which is simply another way of showing that $r(g) \ge -1$, but it proves
that r(g) = -1 if and only if $d_i + d_j = d$ (a constant) for all
$(i,j) \in \mathcal{E}$.

Recall  that  within  Q{D)  one  has  s^:^^=Z^-Z  as  defined  by 
(4),  and  thus  this  graph  corresponds  to  r=-l  if  and  only 
if  for  each  element  k  one  has  Zk+Zk=z  (a  constant). 
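
The equality condition can be checked numerically. The sketch below evaluates both sides of the link-based inequality, the sum of (d_i + d_j)^2 against (1/l) times the squared sum of (d_i + d_j), for two small graphs chosen here purely for illustration: the complete bipartite graph K_{2,3}, in which every link joins a degree-3 vertex to a degree-2 vertex, and a path on four vertices. Function names and vertex labels are assumptions.

```python
from collections import Counter

def link_degree_sums(edges):
    """Return d_i + d_j for every link (i, j)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[u] + deg[v] for u, v in edges]

def inequality_sides(edges):
    """Return (lhs, rhs) of sum (d_i+d_j)^2 >= (1/l) * (sum (d_i+d_j))^2."""
    sums = link_degree_sums(edges)
    lhs = sum(x * x for x in sums)
    rhs = sum(sums) ** 2 / len(sums)
    return lhs, rhs

k23 = [(a, b) for a in ("a1", "a2") for b in ("b1", "b2", "b3")]
path4 = [(0, 1), (1, 2), (2, 3)]

print(inequality_sides(k23))    # equality: d_i + d_j = 5 on every link
print(inequality_sides(path4))  # strict inequality: sums are 3, 4, 3
```

Equality holds for K_{2,3}, consistent with r = -1, while the path gives a strict inequality and hence r > -1.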